Using the dataset obtained from FSU’s Florida Climate Center, for a station at Tampa International Airport (TPA) for 2022, attempt to recreate the charts shown below which were generated using data from 2016. You can read the 2022 dataset using the code below:
```{r}
library(tidyverse)

weather_tpa <- read_csv("https://raw.githubusercontent.com/reisanar/datasets/master/tpa_weather_2022.csv")

# random sample
sample_n(weather_tpa, 4)
```

```
## # A tibble: 4 × 7
##    year month   day precipitation max_temp min_temp ave_temp
##   <dbl> <dbl> <dbl>         <dbl>    <dbl>    <dbl>    <dbl>
## 1  2022     6    18          0          98       81     89.5
## 2  2022     4     3          0          80       68     74
## 3  2022    12    20          0.11       67       58     62.5
## 4  2022     5    22          0          96       74     85
```
Using the 2022 data:

(a) Create a plot like the one below:

Hint: the option `binwidth = 3` was used with the `geom_histogram()` function.
```{r}
library(lubridate)

# Convert the numeric month to a labeled factor (full month names)
weather_tpa$month <- month(weather_tpa$month, label = TRUE, abbr = FALSE)
```
```{r}
ggplot(data = weather_tpa) +
  geom_histogram(aes(x = max_temp, fill = month),
                 binwidth = 3, col = I("white"),
                 show.legend = FALSE) +
  facet_wrap(~ month, nrow = 3, ncol = 4) +
  scale_y_continuous(limits = c(0, 20)) +
  scale_x_continuous(limits = c(60, 90)) +
  ylab("Number of Days") +
  xlab("Maximum Temperature") +
  theme_bw() +
  theme(legend.position = "none")
```
(b) Create a plot like the one below:
Hint: check the `kernel` parameter of the `geom_density()` function, and use `bw = 0.5`.
```{r}
density_plot <- ggplot(data = weather_tpa, aes(x = max_temp)) +
  geom_density(bw = 0.5, kernel = "epanechnikov",
               color = "black", fill = "#7c7c7c") +
  scale_x_continuous(limits = c(60, 90)) +
  ylab("Density") +
  xlab("Maximum Temperature") +
  theme_minimal()

density_plot
```
Hint: default options for geom_density() were used.
```{r}
density_plot2022 <- ggplot(data = weather_tpa,
                           aes(x = max_temp, fill = month)) +
  geom_density(bw = 1, kernel = "epanechnikov",
               color = "black", alpha = 0.8) +
  facet_wrap(~ month, nrow = 3, ncol = 4) +
  labs(x = "Maximum temperatures", y = "",
       title = "Density plot for each month in 2022") +
  scale_x_continuous(limits = c(60, 90)) +
  theme_bw() +
  theme(legend.position = "none")

density_plot2022
```
Hint: use the `{ggridges}` package, and the `geom_density_ridges()` function, paying close attention to the `quantile_lines` and `quantiles` parameters.

The plot above uses the `plasma` option (color scale) for the `viridis` palette.
```{r}
library(ggridges)

ridges <- ggplot(weather_tpa, aes(x = max_temp, y = month, fill = after_stat(x))) +
  geom_density_ridges_gradient(quantile_lines = TRUE, quantiles = c(0.5)) +
  scale_fill_viridis_c(name = "", option = "C") + # option "C" is the plasma palette
  labs(x = "Maximum temperature (in Fahrenheit degrees)", y = "") +
  theme_minimal()

ridges
```
```{r}
weather_tpa <- weather_tpa %>%
  group_by(month) %>%
  mutate(avg_precipitation = mean(precipitation)) %>%
  ungroup() # drop the grouping so it doesn't affect later operations
```
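As a sanity check on the grouped `mutate()` above: it repeats each month's mean on every row of that month, which is convenient for plotting but can be surprising. The same averages can be computed as a one-row-per-month table with `summarise()`. This is a minimal sketch on a toy tibble (hypothetical values, not the TPA data), using standard dplyr:

```r
library(dplyr)
library(tibble)

# Toy stand-in with the same columns used above
toy <- tibble(
  month         = c(1, 1, 2, 2),
  precipitation = c(0.2, 0.4, 1.0, 3.0)
)

# One row per month, rather than repeating the mean on every row
monthly_avg <- toy %>%
  group_by(month) %>%
  summarise(avg_precipitation = mean(precipitation))

monthly_avg
# month 1 -> 0.3, month 2 -> 2.0
```

Either form works for the lollipop chart; the `mutate()` version keeps the daily rows available if you want to layer them on the same plot.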
```{r}
ggplot(data = weather_tpa,
       mapping = aes(x = month, y = avg_precipitation, fill = month)) +
  geom_pointrange(aes(ymin = 0, ymax = avg_precipitation),
                  show.legend = FALSE) +
  labs(y = "precipitation", title = "Average Precipitation in 2022") +
  coord_flip() +
  theme_classic()
```
Since this dataset is for Tampa, Florida, I wanted to see if a lollipop chart could show how much it rains during the summer/hurricane season. Here you can clearly see the average rainfall pick up during the summer and increase further into hurricane season.
Review the set of slides (and additional resources linked in it) for visualizing text data: https://www.reisanar.com/slides/text-viz#1
Using the Billboard Top 100 lyrics dataset, I wanted to create a word cloud of the most used words in the song lyrics.

First I need to read in the data and save it locally.
```{r}
download.file("https://raw.githubusercontent.com/reisanar/datasets/master/BB_top100_2015.csv",
              "billboard.csv")
```
```{r}
# Read the file downloaded above (saved to the working directory)
billboard <- read_csv("billboard.csv")
```
For this visualization I want to make a word cloud using the lyrics to see which words are used the most in popular songs.
```{r}
library(wordcloud2)
library(geniusr)
library(tidytext)
```
Using the function `unnest_tokens()` I was able to easily separate the words from the lyrics column.
```{r}
billboard <- billboard %>%
  unnest_tokens(word, Lyrics)
```
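To make the tokenization step concrete, here is a minimal sketch on a toy one-row tibble (hypothetical `Song`/`Lyrics` values, not the Billboard data). By default `unnest_tokens()` lowercases the text and strips punctuation, producing one row per word:

```r
library(tibble)
library(dplyr)
library(tidytext)

# Toy table standing in for the Billboard data
toy <- tibble(Song = "demo", Lyrics = "Hello, hello world!")

# One row per word, lowercased, punctuation removed
tokens <- toy %>% unnest_tokens(word, Lyrics)

tokens$word
# "hello" "hello" "world"
```

Note that the other columns (`Song` here) are kept and duplicated for each word, which is what lets the counts below be taken across all songs at once.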
After I separated the words out, I wanted to remove the stop words.
```{r}
billboard <- billboard %>%
  anti_join(stop_words)
```
Now that I have my list of words, I counted how many times each word appeared.
```{r}
billboard <- billboard %>%
  count(word, sort = TRUE)
```
`wordcloud2` made it easy to create a word cloud of the most used words. I set a seed to make sure the word cloud is reproducible.
```{r}
set.seed(1031)
wordcloud2(data = billboard)
```